Abstract



In an important paper, M.E.J. Newman claimed that a general network-based 
stochastic Susceptible-Infectious-Removed (SIR) epidemic model is isomorphic 
to a bond percolation model, where the bonds are the edges of the contact net- 
work and the bond occupation probability is equal to the marginal probability 
of transmission from an infected node to a susceptible neighbor. In this paper, 
we show that this isomorphism is incorrect and define a semi-directed random 
network we call the epidemic percolation network that is exactly isomorphic 
to the SIR epidemic model in any finite population. In the limit of a large 
population, (i) the distribution of (self-limited) outbreak sizes is identical to 
the size distribution of (small) out-components, (ii) the epidemic threshold cor- 
responds to the phase transition where a giant strongly-connected component 
appears, (iii) the probability of a large epidemic is equal to the probability that 
an initial infection occurs in the giant in-component, and (iv) the relative final 
size of an epidemic is equal to the proportion of the network contained in the 
giant out-component. For the SIR model considered by Newman, we show that 
the epidemic percolation network predicts the same mean outbreak size below 
the epidemic threshold, the same epidemic threshold, and the same final size 
of an epidemic as the bond percolation model. However, the bond percolation 
model fails to predict the correct outbreak size distribution and probability of 
an epidemic when there is a nondegenerate infectious period distribution. We
confirm our findings by comparing predictions from percolation networks and 
bond percolation models to the results of simulations. In an appendix, we show 
that an isomorphism to an epidemic percolation network can be defined for any 
time-homogeneous stochastic SIR model. 



1 Introduction 

In an important paper, M. E. J. Newman studied a network-based Susceptible-
Infectious-Removed (SIR) epidemic model in which infection is transmitted
through a network of contacts between individuals [1]. The contact network 
itself is a random undirected network with an arbitrary degree distribution of 
the form studied by Newman, Strogatz, and Watts [2]. Given the degree dis- 
tribution, these networks are maximally random, so they have no small loops 
and no degree correlations in the limit of a large population [2-4].

In the stochastic SIR model considered by Newman, the probability that an
infected node $i$ makes infectious contact with a neighbor $j$ is given by
$T_{ij} = 1 - \exp(-\beta_{ij}\tau_i)$, where $\beta_{ij}$ is the rate of
infectious contact from $i$ to $j$ and $\tau_i$ is the time that $i$ remains
infectious. (We use infectious contact to mean a contact that results in
infection if and only if the recipient is susceptible.) The infectious period
$\tau_i$ is a random variable with the cumulative distribution function (cdf)
$F(\tau)$, and the infectious contact rate $\beta_{ij}$ is a random variable
with the cdf $F(\beta)$. The infectious periods for all individuals are
independent and identically distributed (iid), and the infectious contact rates
for all ordered pairs of individuals are iid.

Under these assumptions, Newman claimed that the spread of disease on 
the contact network is exactly isomorphic to a bond percolation model on the 
contact network with bond occupation probability equal to the a priori proba- 
bility of disease transmission between any two connected nodes in the contact 
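network [1]. This probability is called the transmissibility and denoted by $T$:

$$T = \langle T_{ij} \rangle = \int_0^\infty \int_0^\infty \left(1 - e^{-\beta_{ij}\tau_i}\right) dF(\beta_{ij})\, dF(\tau_i). \tag{1}$$
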
Newman used this bond percolation model to derive the distribution of finite 
outbreak sizes, the critical transmissibility T c that defines the epidemic (i.e., 
percolation) threshold, and the probability and relative final size of an epidemic 
(i.e., an outbreak that never goes extinct). 

As a counterexample, consider a contact network where each subject has
exactly two contacts. Assume that (i) $\tau_i = \tau_0 > 0$ with probability
$p$ and $\tau_i = 0$ with probability $1 - p$ and (ii) $\beta_{ij} = \beta_0 > 0$
with probability one for all $ij$. Under the SIR model, the probability that
the infection of a randomly chosen node results in an outbreak of size one is
$p_1 = 1 - p + p e^{-2\beta_0\tau_0}$, which is the sum of the probability
$1 - p$ that $\tau = 0$ and the probability $p e^{-2\beta_0\tau_0}$ that
$\tau = \tau_0$ and disease is not transmitted to either contact. Under the
bond percolation model, the probability of a cluster of size one is
$p_1^{\mathrm{bond}} = (1 - p + p e^{-\beta_0\tau_0})^2$, corresponding to the
probability that neither of the bonds incident to the node is occupied. Since

$$p_1 - p_1^{\mathrm{bond}} = p(1 - p)\left(1 - e^{-\beta_0\tau_0}\right)^2 \geq 0,$$

the bond percolation model correctly predicts the probability of an outbreak of
size one only if $p = 0$ or $p = 1$. When the infectious period is not constant,
it underestimates this probability. The supremum of the error is 0.25, which
occurs when $p = 0.5$ and $\tau_0 \to \infty$. In this limit, the SIR model
corresponds to a site percolation model rather than a bond percolation model.
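
The arithmetic of this counterexample can be checked numerically. The following is a minimal sketch, not code from the paper; the function names `p1_sir`, `p1_bond`, and `error` are hypothetical and chosen only for illustration.

```python
import math


def p1_sir(p, beta0, tau0):
    """P(outbreak of size one) under the SIR model on a degree-2 contact network."""
    # tau = 0 with probability 1 - p; otherwise both contacts must escape infection.
    return (1 - p) + p * math.exp(-2 * beta0 * tau0)


def p1_bond(p, beta0, tau0):
    """P(cluster of size one) under bond percolation with occupation probability T."""
    T = p * (1 - math.exp(-beta0 * tau0))  # marginal transmissibility
    return (1 - T) ** 2  # both incident bonds must be unoccupied


def error(p, beta0, tau0):
    """Amount by which the bond percolation model underestimates p_1."""
    return p1_sir(p, beta0, tau0) - p1_bond(p, beta0, tau0)
```

For example, `error(0.5, 1.0, 100.0)` is within rounding error of the supremum 0.25, while `error(0.0, ...)` and `error(1.0, ...)` vanish.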

When the distribution of infectious periods is nondegenerate, there is no
bond occupation probability that will make the bond percolation model
isomorphic to the SIR model. To see why, suppose node $i$ has infectious period
$\tau_i$ and degree $n_i$ in the contact network. In the epidemic model, the
conditional probability that $i$ transmits infection to a neighbor $j$ in the
contact network given $\tau_i$ is

$$T_{\tau_i} = \int_0^\infty \left(1 - e^{-\beta_{ij}\tau_i}\right) dF(\beta_{ij}). \tag{2}$$

Since the contact rate pairs for all $n_i$ edges incident to $i$ are iid, the
transmission events across these edges are (conditionally) independent
Bernoulli($T_{\tau_i}$) random variables; but the transmission probabilities
are strictly increasing in $\tau_i$, so the transmission events are
(marginally) dependent unless $\tau_i = \tau_0$ with probability one for some
fixed $\tau_0$. In contrast, the bond percolation model treats the infections
generated by node $i$ as $n_i$ (marginally) independent Bernoulli($T$) random
variables regardless of the distribution of $\tau_i$. Neither counterexample
assumes anything about the global properties of the contact network, so
Newman's claim cannot be justified as an approximation in the limit of a large
network with no small loops.
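
The marginal dependence can be made concrete: for two edges leaving the same node, the covariance of the two transmission indicators equals the variance of the conditional transmission probability, which is zero only for a constant infectious period. The sketch below assumes a discrete infectious period distribution and a fixed contact rate `beta`; both simplifications, and the function name, are illustrative assumptions rather than part of the model in [1].

```python
import math


def transmission_moments(taus, probs, beta):
    """Return (T, cov) for the transmission indicators on two edges of one node.

    taus, probs: support and probabilities of a discrete infectious period
    distribution (an assumption made for illustration); beta is a fixed rate.
    """
    # Conditional transmission probability given each infectious period value.
    t_tau = [1 - math.exp(-beta * tau) for tau in taus]
    T = sum(w * t for w, t in zip(probs, t_tau))  # marginal transmissibility
    # Given tau the two edges are independent, so P(both transmit | tau) = t^2.
    both = sum(w * t * t for w, t in zip(probs, t_tau))
    return T, both - T * T
```

With a constant infectious period the covariance is zero, so the bond percolation assumption of independent edges holds; with any nondegenerate distribution it is strictly positive.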

In Section 2, we define a semi-directed random network called the epidemic 
percolation network and show how it can be used to predict the outbreak size 
distribution, the epidemic threshold, and the probability and final size of an 
epidemic in the limit of a large population for any time-homogeneous SIR model. 
In Section 3, we show that the network-based stochastic SIR model from [1] 
can be analyzed correctly using a semi-directed random network of the type 
studied by Boguñá and Serrano [3]. In Section 4, we show that it predicts
the same epidemic threshold, mean outbreak size below the epidemic threshold, 
and relative final size of an epidemic as the bond percolation model. In Section 
5, we show that the bond percolation model fails to predict the distribution 
of outbreak sizes and the probability of an epidemic when the distribution of 
infectious periods is nondegenerate. In Section 6, we compare predictions made 
by epidemic percolation networks and bond percolation models to the results 
of simulations. In an appendix, we define epidemic percolation networks for a 
very general time-homogeneous stochastic SIR model and show that their out- 
components are isomorphic to the distribution of possible outcomes of the SIR 
model for any given set of imported infections. 

2 Epidemic percolation networks 

Consider a node $i$ with degree $n_i$ in the contact network and infectious
period $\tau_i$. In the SIR model defined above, the number of people who will
transmit infection to $i$ if they become infectious has a binomial($n_i$, $T$)
distribution regardless of $\tau_i$. If $i$ is infected along one of the $n_i$
edges, then the number of people to whom $i$ will transmit infection has a
binomial($n_i - 1$, $T_{\tau_i}$) distribution.
In order to produce the correct joint distribution of the number of people who 
will transmit infection to i and the number of people to whom i will transmit 
infection, we represent the former by directed edges that terminate at i and the 
latter by directed edges that originate at i. Since there can be at most one 
transmission of infection between any two persons, we replace pairs of directed 
edges between two nodes with a single undirected edge. 

Starting from the contact network, a single realization of the epidemic per- 
colation network can be generated as follows: 

1. Choose a recovery period $\tau_i$ for every node $i$ in the network and
   choose a contact rate $\beta_{ij}$ for every ordered pair of connected nodes
   $i$ and $j$ in the contact network.

2. For each pair of connected nodes $i$ and $j$ in the contact network, convert
   the undirected edge between them to a directed edge from $i$ to $j$ with
   probability $(1 - e^{-\beta_{ij}\tau_i}) e^{-\beta_{ji}\tau_j}$, to a directed
   edge from $j$ to $i$ with probability
   $e^{-\beta_{ij}\tau_i} (1 - e^{-\beta_{ji}\tau_j})$, and erase the edge
   completely with probability $e^{-\beta_{ij}\tau_i - \beta_{ji}\tau_j}$. The
   edge remains undirected with probability
   $(1 - e^{-\beta_{ij}\tau_i})(1 - e^{-\beta_{ji}\tau_j})$.
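
The two steps above can be sketched as follows. This is an illustrative implementation only: the function name, the edge-list representation, and the `draw_tau` and `draw_beta` sampler callbacks are assumptions introduced here, not part of the paper.

```python
import math
import random


def epidemic_percolation_network(edges, draw_tau, draw_beta, rng=random):
    """Generate one realization of the epidemic percolation network.

    edges: list of (i, j) undirected contact-network edges.
    draw_tau(rng), draw_beta(rng): hypothetical samplers for the infectious
    period and contact rate distributions.
    Returns (directed, undirected): lists of (source, target) and (i, j) pairs.
    """
    nodes = {v for edge in edges for v in edge}
    tau = {v: draw_tau(rng) for v in nodes}  # step 1: one period per node
    directed, undirected = [], []
    for i, j in edges:
        # Step 1 (continued): one contact rate per ordered pair.
        p_ij = 1 - math.exp(-draw_beta(rng) * tau[i])
        p_ji = 1 - math.exp(-draw_beta(rng) * tau[j])
        # Step 2: the two directions are independent given tau and beta.
        i_to_j = rng.random() < p_ij
        j_to_i = rng.random() < p_ji
        if i_to_j and j_to_i:
            undirected.append((i, j))
        elif i_to_j:
            directed.append((i, j))
        elif j_to_i:
            directed.append((j, i))
        # otherwise the edge is erased
    return directed, undirected
```

Drawing the two directions independently reproduces the four probabilities in step 2, because each is a product of one factor for the $i \to j$ direction and one for the $j \to i$ direction.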

The epidemic percolation network is a semi-directed random network that 
represents a single realization of the infectious contact process for each connected 
pair of nodes, so $4^m$ possible percolation networks exist for a contact network
with m edges. The probability of each possible network is determined by the 
underlying SIR model. The epidemic percolation network is very similar to 
the locally dependent random graph defined by Kuulasmaa [5] for an epidemic 
on a d-dimensional lattice. There are two important differences: First, the 
underlying structure of the contact network is not assumed to be a lattice. 
Second, we replace pairs of (occupied) directed edges between two nodes with 
a single undirected edge so that its component structure can be analyzed using 
a generating function formalism. 

In the Appendix, we prove that the size distribution of outbreaks starting 
from any node in a time-homogeneous stochastic SIR model is identical to the 
distribution of its out-component sizes in the corresponding probability space 
of percolation networks. Since this result applies to any time-homogeneous 
SIR model, it can be used to analyze network-based models, fully-mixed models 
(see [6]), and models with multiple levels of mixing. 

2.1 Components of semi-directed networks 

In this section, we briefly review the structure of directed and semi-directed 
networks as discussed in [3,4,7,8]. In the next section, we relate this to the 
possible outcomes of an SIR model. 

The indegree and outdegree of node i are the number of incoming and out- 
going directed edges incident to i. Since each directed edge is an outgoing edge 
for one node and an incoming edge for another node, the mean indegree and 
outdegree are equal. The undirected degree of node i is the number of
undirected edges incident to i. The size of a component is the number of nodes it
contains and its relative size is its size divided by the total size of the network. 

The out-component of node i includes i and all nodes that can be reached 
from i by following a series of edges in the proper direction (undirected edges 
are bidirectional). The in-component of node i includes i and all nodes from
which i can be reached by following a series of edges in the proper direction. 
By definition, node i is in the in-component of node j if and only if j is in the 
out-component of i. Therefore, the mean in- and out-component sizes in any 
(semi-)directed network are equal. 

The strongly-connected component of a node i is the intersection of its in-
and out-components; it is the set of all nodes that can be reached from node
i and from which node i can be reached. All nodes in a strongly-connected
component have the same in-component and the same out-component. The
weakly-connected component of node i is the set of nodes that are connected to
i when the direction of the edges is ignored. 
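
These definitions translate directly into breadth-first searches. The following is a minimal sketch; the representation of a semi-directed network as separate lists of directed and undirected edges, and all function names, are assumptions made for illustration.

```python
from collections import deque


def reachable(start, directed, undirected, reverse=False):
    """Nodes reachable from start; reverse=True follows directed edges backwards."""
    adj = {}
    for a, b in directed:
        if reverse:
            a, b = b, a
        adj.setdefault(a, []).append(b)
    for a, b in undirected:  # undirected edges are bidirectional
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def out_component(node, directed, undirected):
    return reachable(node, directed, undirected, reverse=False)


def in_component(node, directed, undirected):
    return reachable(node, directed, undirected, reverse=True)


def strongly_connected_component(node, directed, undirected):
    # Intersection of the in- and out-components, as defined above.
    return in_component(node, directed, undirected) & out_component(
        node, directed, undirected
    )
```

In a network with a directed cycle 1, 2, 3, a directed edge from 3 to 4, and an undirected edge between 4 and 5, the out-component of node 1 is all five nodes while its in-component and strongly-connected component are both {1, 2, 3}.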

For giant components, we use the definitions given in [8,9]. Giant compo- 
nents have asymptotically positive relative size in the limit of a large population. 
All other components are "small" in the sense that they have asymptotically 
zero relative size. There are two phase transitions in a semi-directed network: 
One where a unique giant weakly-connected component (GWCC) emerges and 
another where unique giant in-, out-, and strongly-connected components (GIN, 
GOUT, and GSCC) emerge. The GWCC contains the other three giant com- 
ponents. The GSCC is the intersection of the GIN and the GOUT, which 
are the common in- and out-components of nodes in the GSCC. Tendrils are 
components in the GWCC that are outside the GIN and the GOUT. Tubes are 
directed paths from the GIN to the GOUT that do not intersect the GSCC. All 
tendrils and tubes are small components. A schematic representation of these 
components is shown in Figure 1.

2.2 Epidemic percolation networks and epidemics 

An outbreak begins when one or more nodes are infected from outside the pop- 
ulation. These are called imported infections. The final size of an outbreak 
is the number of nodes that are infected before the end of transmission, and 
its relative final size is its final size divided by the total size of the network. 
In the epidemic percolation network, the nodes infected in the outbreak can 
be identified with the nodes in the out-components of the imported infections. 
This identification is made mathematically precise in the Appendix. 

Informally, we define a self-limited outbreak to be an outbreak whose relative 
final size approaches zero in the limit of a large population and an epidemic to be 
an outbreak whose relative final size is positive in the limit of a large population. 
There is a critical transmissibility $T_c$ that defines the epidemic threshold:
The probability of an epidemic is zero when $T < T_c$, and the probability and
final size of an epidemic are positive when $T > T_c$ [1, 10-12].

If all out-components in the epidemic percolation network are small, then 
only self-limited outbreaks are possible. If the percolation network contains
a GSCC, then any infection in the GIN will lead to the infection of the entire 
GOUT. Therefore, the epidemic threshold corresponds to the emergence of the 
GSCC in the percolation network. For any finite set of imported infections, 
the probability of an epidemic is equal to the probability that at least one 
imported infection occurs in the GIN. The relative final size of an epidemic is 
equal to the proportion of the network contained in the GOUT. Although some 
nodes outside the GOUT may be infected (e.g. nodes in tendrils and tubes), 
they constitute a finite number of small components whose total relative size is 
asymptotically zero. 

3 Analysis of the SIR model 

To analyze the SIR model from [1], we first calculate the probability generating
function (pgf) of the degree distribution of the corresponding epidemic
percolation network. Then we use methods developed by Boguñá and Serrano [3] and
Meyers et al. [4] to calculate the in- and out-component size distributions and 
the relative sizes of the GIN, GOUT, and GSCC. 

3.1 Degree distribution 

If $p_n$ is the probability that a node has degree $n$ in the contact network,
then

$$\mathcal{G}(z) = \sum_{n=0}^{\infty} p_n z^n$$

is the probability generating function (pgf) for the degree distribution of the
contact network. If $p_{jkm}$ is the probability that a node in the epidemic
percolation network has $j$ incoming edges, $k$ outgoing edges, and $m$
undirected edges, then

$$G(x, y, u) = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{m=0}^{\infty} p_{jkm}\, x^j y^k u^m$$

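is the pgf for the degree distribution of the percolation network. Suppose nodes
$i$ and $j$ are connected in the contact network with contact rates
$(\beta_{ij}, \beta_{ji})$ and infectious periods $\tau_i$ and $\tau_j$. Let
$g(x, y, u \mid \beta_{ij}, \beta_{ji}, \tau_i, \tau_j)$ be the conditional pgf
for the number of incoming, outgoing, and undirected edges incident to $i$ that
appear between $i$ and $j$ in the percolation network. Then

$$g(x, y, u \mid \beta_{ij}, \beta_{ji}, \tau_i, \tau_j) = e^{-\beta_{ij}\tau_i - \beta_{ji}\tau_j} + e^{-\beta_{ij}\tau_i}\left(1 - e^{-\beta_{ji}\tau_j}\right) x + \left(1 - e^{-\beta_{ij}\tau_i}\right) e^{-\beta_{ji}\tau_j}\, y + \left(1 - e^{-\beta_{ij}\tau_i}\right)\left(1 - e^{-\beta_{ji}\tau_j}\right) u.$$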

Given $\tau_i$, the conditional pgf for the number of incoming, outgoing, and
undirected edges incident to $i$ that appear in the percolation network between
$i$ and any neighbor of $i$ in the contact network is

$$g(x, y, u \mid \tau_i) = \int_0^\infty \int_0^\infty \int_0^\infty g(x, y, u \mid \beta_{ij}, \beta_{ji}, \tau_i, \tau_j)\, dF(\beta_{ij})\, dF(\beta_{ji})\, dF(\tau_j)$$
$$= (1 - T_{\tau_i})(1 - T) + (1 - T_{\tau_i}) T x + T_{\tau_i} (1 - T) y + T_{\tau_i} T u. \tag{3}$$

The pgf for the degree distribution of a node with infectious period $\tau_i$ is

$$G(x, y, u \mid \tau_i) = \sum_{n=0}^{\infty} p_n \left(g(x, y, u \mid \tau_i)\right)^n = \mathcal{G}(g(x, y, u \mid \tau_i)). \tag{4}$$

Finally, the pgf for the degree distribution of the epidemic percolation network
is

$$G(x, y, u) = \int_0^\infty G(x, y, u \mid \tau_i)\, dF(\tau_i). \tag{5}$$

If $a$, $b$, and $c$ are nonnegative integers, let $G^{(a,b,c)}(x, y, u)$ be the
derivative obtained after differentiating $a$ times with respect to $x$, $b$
times with respect to $y$, and $c$ times with respect to $u$. Then the mean
indegree and outdegree of the percolation network are

$$\langle k_d \rangle = G^{(1,0,0)}(1, 1, 1) = G^{(0,1,0)}(1, 1, 1) = T(1 - T)\, \mathcal{G}'(1),$$

and the mean undirected degree is

$$\langle k_u \rangle = G^{(0,0,1)}(1, 1, 1) = T^2\, \mathcal{G}'(1).$$
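
The mean-degree formulas can be checked numerically by building the pgf from equations (3) to (5) and differentiating at $(1, 1, 1)$. The sketch below assumes a Poisson contact network with mean `lam`, a fixed contact rate `beta`, and a discrete infectious period distribution; these choices and the helper names are illustrative assumptions, not part of the paper.

```python
import math


def make_pgf(lam, taus, probs, beta):
    """Return (T, G) where G(x, y, u) is the percolation-network degree pgf.

    Assumes (for illustration) a Poisson(lam) contact network, whose pgf is
    exp(lam * (z - 1)), and infectious periods taus with probabilities probs.
    """
    t_tau = [1 - math.exp(-beta * t) for t in taus]
    T = sum(w * t for w, t in zip(probs, t_tau))

    def g(x, y, u, tt):  # equation (3), with tt = T_tau
        return (1 - tt) * (1 - T) + (1 - tt) * T * x + tt * (1 - T) * y + tt * T * u

    def G(x, y, u):  # equations (4) and (5); the integral becomes a finite sum
        return sum(
            w * math.exp(lam * (g(x, y, u, tt) - 1)) for w, tt in zip(probs, t_tau)
        )

    return T, G


def partial(G, point, axis, h=1e-6):
    """Central finite-difference approximation to a first partial derivative."""
    lo, hi = list(point), list(point)
    lo[axis] -= h
    hi[axis] += h
    return (G(*hi) - G(*lo)) / (2 * h)
```

For any such parameters, `partial(G, (1, 1, 1), 0)` and `partial(G, (1, 1, 1), 1)` should both be close to `T * (1 - T) * lam`, and `partial(G, (1, 1, 1), 2)` close to `T * T * lam`.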



3.2 Generating functions 

When the contact network underlying an SIR epidemic model is an undirected 
random network with an arbitrary degree distribution, the pgf of its degree 
distribution can be used to calculate the distribution of small component sizes, 
the percolation threshold, and the relative sizes of the GIN, GOUT, and GSCC 
using methods developed by Boguñá and Serrano [3] and Meyers et al. [4].
These methods generalize earlier methods for undirected and purely directed 
networks [1,2,13-16]. In this section, we review these results and introduce 
notation that will be used in the rest of the paper. We discuss the case of 
networks with no two-point degree correlations, which is sufficient to analyze 
the SIR model from [1]. 

Let $G_f(x, y, u)$ be the pgf for the degree distribution of a node reached
by going forward along a directed edge, excluding the edge used to reach the
node. Since the probability of reaching any node by following a directed edge
is proportional to its indegree,

$$G_f(x, y, u) = \frac{1}{\langle k_d \rangle} \sum_{j,k,m} j\, p_{jkm}\, x^{j-1} y^k u^m = \frac{1}{\langle k_d \rangle}\, G^{(1,0,0)}(x, y, u). \tag{6}$$

Similarly, the pgf for the degree distribution of a node reached by going in
reverse along a directed edge (excluding the edge used to reach the node) is

$$G_r(x, y, u) = \frac{1}{\langle k_d \rangle}\, G^{(0,1,0)}(x, y, u), \tag{7}$$

and the pgf for the degree distribution of a node reached by going to the end of
an undirected edge (excluding the edge used to reach the node) is

$$G_u(x, y, u) = \frac{1}{\langle k_u \rangle}\, G^{(0,0,1)}(x, y, u). \tag{8}$$



3.2.1 Out-components 

Let $H_f^{\mathrm{out}}(z)$ be the pgf for the size of the out-component at the
end of a directed edge and $H_u^{\mathrm{out}}(z)$ be the pgf for the size of
the out-component at the "end" of an undirected edge. Then, in the limit of a
large population,

$$H_f^{\mathrm{out}}(z) = z\, G_f(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z)), \tag{9a}$$
$$H_u^{\mathrm{out}}(z) = z\, G_u(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z)). \tag{9b}$$

The pgf for the out-component size of a randomly chosen node is

$$H^{\mathrm{out}}(z) = z\, G(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z)). \tag{10}$$

The probability that a node has a finite out-component in the limit of a large
population is $H^{\mathrm{out}}(1)$, so the probability that a randomly chosen
node is in the GIN is $1 - H^{\mathrm{out}}(1)$.

The coefficients on $z$ in $H_f^{\mathrm{out}}(z)$ and $H_u^{\mathrm{out}}(z)$
are $G_f(1, 0, 0)$ and $G_u(1, 0, 0)$ respectively. Therefore, power series for
$H_f^{\mathrm{out}}(z)$ and $H_u^{\mathrm{out}}(z)$ can be computed to any
desired order by iterating equations (9a) and (9b). A power series for
$H^{\mathrm{out}}(z)$ can then be obtained using equation (10). For any
$z \in [0, 1]$, $H_f^{\mathrm{out}}(z)$ and $H_u^{\mathrm{out}}(z)$ can be
calculated with arbitrary precision by iterating equations (9a) and (9b)
starting from initial values $y_0, u_0 \in [0, 1)$. Estimates of
$H_f^{\mathrm{out}}(z)$ and $H_u^{\mathrm{out}}(z)$ can be used to estimate
$H^{\mathrm{out}}(z)$ with arbitrary precision.
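
The fixed-point iteration described above can be sketched for one concrete case. The example assumes a Poisson contact network with mean `lam` (so that the contact-network pgf and its normalized derivative are both `exp(lam * (x - 1))`), a fixed contact rate `beta`, and a discrete infectious period distribution with transmissibility strictly between 0 and 1. These assumptions and the function name are introduced here for illustration only.

```python
import math


def out_component_pgfs(lam, taus, probs, beta, z=1.0, tol=1e-12, max_iter=100000):
    """Iterate equations (9a)-(9b) at a fixed z, then evaluate equation (10).

    Returns (H_out, H_f_out, H_u_out) evaluated at z. Illustrative assumptions:
    Poisson(lam) contact network, fixed beta, discrete tau, and 0 < T < 1.
    """
    t_tau = [1 - math.exp(-beta * t) for t in taus]
    T = sum(w * t for w, t in zip(probs, t_tau))

    def weights(y, u):
        # probs[k] * exp(lam * (g(1, y, u | tau_k) - 1)), using equation (3):
        # g(1, y, u | tau) - 1 = T_tau * ((1 - T) * y + T * u - 1).
        mix = (1 - T) * y + T * u
        return [w * math.exp(lam * tt * (mix - 1)) for w, tt in zip(probs, t_tau)]

    y = u = 0.0  # initial values in [0, 1)
    for _ in range(max_iter):
        wts = weights(y, u)
        new_y = z * sum(w * (1 - tt) for w, tt in zip(wts, t_tau)) / (1 - T)  # (9a)
        new_u = z * sum(w * tt for w, tt in zip(wts, t_tau)) / T  # (9b)
        converged = abs(new_y - y) + abs(new_u - u) < tol
        y, u = new_y, new_u
        if converged:
            break
    h_out = z * sum(weights(y, u))  # (10)
    return h_out, y, u
```

At `z = 1`, one minus the first returned value estimates the probability that a randomly chosen node is in the GIN. With a constant infectious period this reduces to the familiar Poisson giant-component equation $S = 1 - e^{-\lambda T S}$.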

The expected size of the out-component of a randomly chosen node below
the epidemic threshold is $H^{\mathrm{out}\prime}(1)$. Taking derivatives in
(10) yields

$$H^{\mathrm{out}\prime}(1) = 1 + \langle k_d \rangle\, H_f^{\mathrm{out}\prime}(1) + \langle k_u \rangle\, H_u^{\mathrm{out}\prime}(1). \tag{11}$$

Taking derivatives in equations (9a) and (9b) and using the fact that
$H_f^{\mathrm{out}}(1) = H_u^{\mathrm{out}}(1) = 1$ below the epidemic threshold
yields a set of linear equations for $H_f^{\mathrm{out}\prime}(1)$ and
$H_u^{\mathrm{out}\prime}(1)$. These can be solved to yield

$$H_f^{\mathrm{out}\prime}(1) = \frac{1 + G_f^{(0,0,1)} - G_u^{(0,0,1)}}{\left(1 - G_f^{(0,1,0)}\right)\left(1 - G_u^{(0,0,1)}\right) - G_f^{(0,0,1)} G_u^{(0,1,0)}} \tag{12}$$

and

$$H_u^{\mathrm{out}\prime}(1) = \frac{1 - G_f^{(0,1,0)} + G_u^{(0,1,0)}}{\left(1 - G_f^{(0,1,0)}\right)\left(1 - G_u^{(0,0,1)}\right) - G_f^{(0,0,1)} G_u^{(0,1,0)}}, \tag{13}$$

where the argument of all derivatives is (1, 1, 1).

3.2.2 In-components 

The in-component size distribution of a semi-directed network can be derived
using the same logic used to find the out-component size distribution, except
that we consider going backwards along directed edges. Let
$H_r^{\mathrm{in}}(z)$ be the pgf for the size of the in-component at the
beginning of a directed edge, $H_u^{\mathrm{in}}(z)$ be the pgf for the size of
the in-component at the "beginning" of an undirected edge, and
$H^{\mathrm{in}}(z)$ be the pgf for the in-component size of a randomly chosen
node. Then, in the limit of a large population,

$$H_r^{\mathrm{in}}(z) = z\, G_r(H_r^{\mathrm{in}}(z), 1, H_u^{\mathrm{in}}(z)), \tag{14a}$$
$$H_u^{\mathrm{in}}(z) = z\, G_u(H_r^{\mathrm{in}}(z), 1, H_u^{\mathrm{in}}(z)), \tag{14b}$$
$$H^{\mathrm{in}}(z) = z\, G(H_r^{\mathrm{in}}(z), 1, H_u^{\mathrm{in}}(z)). \tag{14c}$$

The probability that a node has a finite in-component is $H^{\mathrm{in}}(1)$,
so the probability that a randomly chosen node is in the GOUT is
$1 - H^{\mathrm{in}}(1)$. Power series and numerical estimates for
$H_r^{\mathrm{in}}(z)$, $H_u^{\mathrm{in}}(z)$, and $H^{\mathrm{in}}(z)$ can be
obtained by iterating these equations.

The expected size of the in-component of a randomly chosen node below
the epidemic threshold is $H^{\mathrm{in}\prime}(1)$. Taking derivatives in
equation (14c) yields

$$H^{\mathrm{in}\prime}(1) = 1 + \langle k_d \rangle\, H_r^{\mathrm{in}\prime}(1) + \langle k_u \rangle\, H_u^{\mathrm{in}\prime}(1). \tag{15}$$

Taking derivatives in equations (14a) and (14b) and using the fact that
$H_r^{\mathrm{in}}(1) = H_u^{\mathrm{in}}(1) = 1$ in a subcritical network yields

$$H_r^{\mathrm{in}\prime}(1) = \frac{1 + G_r^{(0,0,1)} - G_u^{(0,0,1)}}{\left(1 - G_r^{(1,0,0)}\right)\left(1 - G_u^{(0,0,1)}\right) - G_r^{(0,0,1)} G_u^{(1,0,0)}} \tag{16}$$

and

$$H_u^{\mathrm{in}\prime}(1) = \frac{1 - G_r^{(1,0,0)} + G_u^{(1,0,0)}}{\left(1 - G_r^{(1,0,0)}\right)\left(1 - G_u^{(0,0,1)}\right) - G_r^{(0,0,1)} G_u^{(1,0,0)}}, \tag{17}$$

where the argument of all derivatives is (1, 1, 1).



3.2.3 Epidemic threshold 

The epidemic threshold occurs when the expected size of the in- and
out-components in the network becomes infinite. This occurs when the
denominators in equations (12) and (13) and equations (16) and (17) approach
zero. From the definitions of $G_f(x, y, u)$, $G_r(x, y, u)$ and
$G_u(x, y, u)$, both conditions are equivalent to

$$\left(1 - \frac{1}{\langle k_d \rangle}\, G^{(1,1,0)}\right)\left(1 - \frac{1}{\langle k_u \rangle}\, G^{(0,0,2)}\right) - \frac{1}{\langle k_d \rangle \langle k_u \rangle}\, G^{(1,0,1)} G^{(0,1,1)} = 0,$$

where the argument of all derivatives is (1, 1, 1). Therefore, there is a single
epidemic threshold where the GSCC, the GIN, and the GOUT appear simultaneously
in both purely directed networks [1,2,13-16] and semi-directed networks [3,4].

3.2.4 Giant strongly-connected component 

A node is in the GSCC if its in- and out-components are both infinite. A
randomly chosen node has a finite in-component with probability
$G(H_r^{\mathrm{in}}(1), 1, H_u^{\mathrm{in}}(1))$ and a finite out-component
with probability $G(1, H_f^{\mathrm{out}}(1), H_u^{\mathrm{out}}(1))$. The
probability that a node reached by following an undirected edge has finite in-
and out-components is the solution to the equation

$$v = G_u(H_r^{\mathrm{in}}(1), H_f^{\mathrm{out}}(1), v),$$

and the probability that a randomly chosen node has finite in- and
out-components is $G(H_r^{\mathrm{in}}(1), H_f^{\mathrm{out}}(1), v)$ [3]. Thus,
the relative size of the GSCC is

$$1 - G(H_r^{\mathrm{in}}(1), 1, H_u^{\mathrm{in}}(1)) - G(1, H_f^{\mathrm{out}}(1), H_u^{\mathrm{out}}(1)) + G(H_r^{\mathrm{in}}(1), H_f^{\mathrm{out}}(1), v).$$

4 In-components

In this section, we prove that the in-component size distribution of the epidemic
percolation network for the SIR model from [1] is identical to the component
size distribution of the bond percolation model with bond occupation probability
$T$. The probability generating function for the total number of incoming and
undirected edges incident to any node $i$ is

$$G(x, 1, x \mid \tau_i) = \mathcal{G}(g(x, 1, x \mid \tau_i)) = \mathcal{G}(1 - T + Tx),$$

which is independent of $\tau_i$. If node $i$ has degree $n_i$ in the contact
network, then the number of nodes we can reach by going in reverse along a
directed edge or an undirected edge has a binomial($n_i$, $T$) distribution
regardless of $\tau_i$. If we reach node $i$ by going backwards along edges, the
number of nodes we can reach from $i$ by continuing to go backwards (excluding
the node from which we arrived) has a binomial($n_i - 1$, $T$) distribution.
Therefore, the in-component of any node in the percolation network is exactly
like a component of a bond percolation model with occupation probability $T$.
This argument was used to justify the mapping from an epidemic model to a bond
percolation model in [1], but it does not apply to the out-components of the
epidemic percolation network.

Methods of calculating the component size distribution of an undirected
random network with an arbitrary degree distribution using the pgf of its degree
distribution were developed by Newman et al. [2,13-16]. These methods were
used to analyze the bond percolation model of disease transmission [1], obtaining
results similar to those obtained by Andersson [17] for the epidemic threshold
and the final size of an epidemic. In this paragraph, we review these results
and introduce notation that will be used in this section. Let $\mathcal{G}(u)$
be the pgf for the degree distribution of the contact network. Then the pgf for
the degree of a node reached by following an edge (excluding the edge used to
reach that node) is $\mathcal{G}_1(u) = \langle n \rangle^{-1} \mathcal{G}'(u)$,
where $\langle n \rangle = \mathcal{G}'(1)$ is the mean degree of the contact
network. With bond occupation probability $T$, the number of occupied edges
adjacent to a randomly chosen node has the pgf $\mathcal{G}(1 - T + Tu)$ and the
number of occupied edges from which infection can leave a node that has been
infected along an edge has the pgf $\mathcal{G}_1(1 - T + Tu)$. The pgf for the
size of the component at the end of an edge is

$$H_1(z) = z\, \mathcal{G}_1(1 - T + T H_1(z)) \tag{18}$$

and the pgf for the size of the component of a randomly chosen node is

$$H_0(z) = z\, \mathcal{G}(1 - T + T H_1(z)). \tag{19}$$

The proportion of the network contained in the giant component is $1 - H_0(1)$,
and the mean size of components below the percolation threshold is $H_0'(1)$.
$H_0(z)$ and $H_1(z)$ can be expanded as power series to any desired degree by
iterating equations (18) and (19), and their value for any fixed $z \in [0, 1]$
can be found by iteration from an initial value $z_0 \in [0, 1)$.
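
The iteration of equations (18) and (19) at a fixed $z$ can be sketched as follows. The degree distribution is passed as a dictionary mapping degree to probability; this interface and the function name are assumptions made for illustration.

```python
def bond_percolation_pgfs(degree_probs, T, z=1.0, tol=1e-12, max_iter=100000):
    """Return (H_0(z), H_1(z)) for bond percolation with occupation probability T.

    degree_probs: dict {degree n: probability p_n} for the contact network
    (a hypothetical input format chosen for this sketch).
    """
    mean_degree = sum(n * p for n, p in degree_probs.items())

    def G0(x):  # pgf of the degree distribution
        return sum(p * x ** n for n, p in degree_probs.items())

    def G1(x):  # pgf of the excess degree of a node reached along an edge
        return sum(n * p * x ** (n - 1) for n, p in degree_probs.items() if n > 0) / mean_degree

    h1 = 0.0  # initial value in [0, 1)
    for _ in range(max_iter):
        new_h1 = z * G1(1 - T + T * h1)  # equation (18)
        converged = abs(new_h1 - h1) < tol
        h1 = new_h1
        if converged:
            break
    h0 = z * G0(1 - T + T * h1)  # equation (19)
    return h0, h1
```

For a contact network in which every node has degree 3 and $T = 0.8$, equation (18) at $z = 1$ reads $h = (0.2 + 0.8h)^2$, whose smallest root is $h = 0.0625$, so $H_0(1) = 0.25^3 = 0.015625$ and the giant component contains a fraction $0.984375$ of the network.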

We can now prove that the distribution of component sizes in the bond 
percolation model is identical to the distribution of in-component sizes in the 
epidemic percolation network. 

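Lemma 1 $G_r(x, y, u) = G_u(x, y, u)$ for all $x, y, u$.

Proof. From equation (7),

$$G_r(x, y, u) = \frac{1}{T(1 - T)\, \mathcal{G}'(1)}\, G^{(0,1,0)}(x, y, u) = \frac{1}{T\, \mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(x, y, u \mid \tau_i))\, T_{\tau_i}\, dF(\tau_i).$$

From equation (8),

$$G_u(x, y, u) = \frac{1}{T^2\, \mathcal{G}'(1)}\, G^{(0,0,1)}(x, y, u) = \frac{1}{T\, \mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(x, y, u \mid \tau_i))\, T_{\tau_i}\, dF(\tau_i).$$

Thus, the degree distribution of a node reached by going backwards along an
edge is independent of whether it was a directed or undirected edge. ■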
Lemma 2 $H_r^{\mathrm{in}}(z) = H_u^{\mathrm{in}}(z) = H_1(z)$ for all $z$.

Proof. From equations (14a) and (14b) and Lemma 1,

$$H_r^{\mathrm{in}}(z) = z\, G_r(H_r^{\mathrm{in}}(z), 1, H_u^{\mathrm{in}}(z)) = z\, G_u(H_r^{\mathrm{in}}(z), 1, H_u^{\mathrm{in}}(z)) = H_u^{\mathrm{in}}(z).$$

Let $H_*^{\mathrm{in}}(z) = H_r^{\mathrm{in}}(z) = H_u^{\mathrm{in}}(z)$. Since
$g(x, 1, x \mid \tau_i) = 1 - T + Tx$ for all $\tau_i$,

$$H_*^{\mathrm{in}}(z) = \frac{z}{T\, \mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(1 - T + T H_*^{\mathrm{in}}(z))\, T_{\tau_i}\, dF(\tau_i) = \frac{z}{\mathcal{G}'(1)}\, \mathcal{G}'(1 - T + T H_*^{\mathrm{in}}(z)).$$

From equation (18), we have

$$H_1(z) = \frac{z}{\mathcal{G}'(1)}\, \mathcal{G}'(1 - T + T H_1(z)).$$

Since there is a unique pgf that solves this equation,
$H_*^{\mathrm{in}}(z) = H_1(z)$. Thus, the in-component size distribution at the
beginning of an edge is the same for directed and undirected edges, and it is
identical to the distribution of component sizes at the end of an occupied edge
in the bond percolation model. ■

Theorem 3 $H^{\mathrm{in}}(z) = H_0(z)$.

Proof. Let $H_*^{\mathrm{in}}(z) = H_r^{\mathrm{in}}(z) = H_u^{\mathrm{in}}(z)$.
From equation (14c), the probability generating function for the distribution
of in-component sizes in the percolation network is

$$H^{\mathrm{in}}(z) = z\, G(H_*^{\mathrm{in}}(z), 1, H_*^{\mathrm{in}}(z)) = z \int_0^\infty \mathcal{G}(g(H_*^{\mathrm{in}}(z), 1, H_*^{\mathrm{in}}(z) \mid \tau_i))\, dF(\tau_i) = z\, \mathcal{G}(1 - T + T H_*^{\mathrm{in}}(z)).$$

When $H_1(z)$ is substituted for $H_*^{\mathrm{in}}(z)$ (which is justified by
the previous Lemma), this is identical to equation (19) for $H_0(z)$ in the bond
percolation model. Since there is a unique pgf solution to this equation,
$H^{\mathrm{in}}(z) = H_0(z)$, so the distribution of in-components in the
percolation network is identical to the distribution of component sizes in the
bond percolation model. ■

Since the mean size of out-components is equal to the mean size of in- 
components in any semi-directed network, the bond percolation model correctly 
predicts the mean size of outbreaks below the epidemic threshold. Since the 
mean sizes of in- and out-components diverge simultaneously, the bond
percolation model also correctly predicts the critical transmissibility $T_c$. Since the
probability of having a finite in-component in the percolation model is equal to 
the probability of being in a finite component of the bond percolation model, 
the bond percolation model also correctly predicts the final size of an epidemic. 
5 Out-components



In this section, we prove that the distribution of out-component sizes in the epi- 
demic percolation network for the SIR model from [1] is always different than 
the distribution of in-component sizes when there is a nondegenerate distribu- 
tion of infectious periods. As a corollary, we find that the probability of an 
epidemic in the SIR model from the Introduction is always less than or equal 
to its final size, with equality only when epidemics have probability zero or the 
infectious period is constant. This is similar to a result obtained by Kuulasmaa 
and Zachary [18], who found that an SIR model defined on the d-dimensional 
integer lattice reduced to a bond percolation process if and only if the infectious 
period is constant. 

The probability generating function for the total number of outgoing and
undirected edges incident to a node $i$ with infectious period $\tau_i$ is

$$G(1, y, y \mid \tau_i) = \mathcal{G}(g(1, y, y \mid \tau_i)) = \mathcal{G}(1 - T_{\tau_i} + T_{\tau_i} y),$$

where $T_{\tau_i}$ is the conditional probability of transmission across each
edge given $\tau_i$, as defined in equation (2). The number of nodes we can
reach by going forwards along edges starting from $i$ has a
binomial($n_i$, $T_{\tau_i}$) distribution. If we reach a node $j$ by following
an edge, then the number of nodes we can reach from $j$ by continuing to go
forwards (excluding the node from which we arrived) has a
binomial($n_j - 1$, $T_{\tau_j}$) distribution. Unless $\tau_i$ is constant, the
out-components of the epidemic percolation network are not like the components
of a bond percolation model.

Suppose $i$ and $j$ are connected in the contact network. The conditional
transmission probability from $j$ to $i$ given $\tau_i$ is always $T$. Thus, an
edge across which we leave any node is directed (i.e., outgoing) with
probability $1 - T$ and undirected with probability $T$. This allows us to
calculate the pgfs of the out-component distributions without differentiating
between outgoing and undirected edges: Let

$$G_*(x, y, u) = (1 - T)\, G_f(x, y, u) + T\, G_u(x, y, u) = \frac{1}{\mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(x, y, u \mid \tau_i))\, dF(\tau_i)$$

be the probability generating function for the degree distribution of a node that
we reach by going forward along an outgoing or undirected edge (excluding the
edge along which we arrived). Let

$$H_*^{\mathrm{out}}(z) = (1 - T)\, H_f^{\mathrm{out}}(z) + T\, H_u^{\mathrm{out}}(z)$$

be the probability generating function for the size of the out-component at the
end of an outgoing or undirected edge.

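Lemma 4 For the SIR model from [1],

$$\begin{aligned}
H_f^{\mathrm{out}}(z) &= z G_f(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)), \\
H_u^{\mathrm{out}}(z) &= z G_u(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)), \\
H^{\mathrm{out}}(z) &= z G(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)),
\end{aligned}$$

and we have the following self-similarity equation:

$$H_o^{\mathrm{out}}(z) = z G_o(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)).$$

Proof. From equation (3), we have

$$g(1, (1 - T)y + Tu, (1 - T)y + Tu \mid \tau_i) = 1 - T_{\tau_i} + T_{\tau_i}[(1 - T)y + Tu] = g(1, y, u \mid \tau_i)$$

for all $y$, $u$, and $\tau_i$. This allows us to rewrite equation (9a):

$$\begin{aligned}
H_f^{\mathrm{out}}(z) &= z G_f(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z)) \\
&= \frac{z}{(1 - T)\mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z) \mid \tau_i)) (1 - T_{\tau_i}) \, dF(\tau_i) \\
&= \frac{z}{(1 - T)\mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z) \mid \tau_i)) (1 - T_{\tau_i}) \, dF(\tau_i) \\
&= z G_f(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)).
\end{aligned}$$

Similarly, we can rewrite equation (9b):

$$\begin{aligned}
H_u^{\mathrm{out}}(z) &= z G_u(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z)) \\
&= \frac{z}{T \mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z) \mid \tau_i)) T_{\tau_i} \, dF(\tau_i) \\
&= \frac{z}{T \mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(g(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z) \mid \tau_i)) T_{\tau_i} \, dF(\tau_i) \\
&= z G_u(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)).
\end{aligned}$$

Finally, we can rewrite equation (10):

$$\begin{aligned}
H^{\mathrm{out}}(z) &= z G(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z)) \\
&= z \int_0^\infty \mathcal{G}(g(1, H_f^{\mathrm{out}}(z), H_u^{\mathrm{out}}(z) \mid \tau_i)) \, dF(\tau_i) \\
&= z \int_0^\infty \mathcal{G}(g(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z) \mid \tau_i)) \, dF(\tau_i) \\
&= z G(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z));
\end{aligned}$$

but then

$$\begin{aligned}
H_o^{\mathrm{out}}(z) &= (1 - T) H_f^{\mathrm{out}}(z) + T H_u^{\mathrm{out}}(z) \\
&= z [(1 - T) G_f(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)) + T G_u(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z))] \\
&= z G_o(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)). \qquad \blacksquare
\end{aligned}$$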
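The identity that drives this proof can be checked numerically. The sketch below assumes that equation (3) has the single-edge form $g(x, y, u \mid \tau_i) = (1 - T_{\tau_i})(1 - T) + (1 - T_{\tau_i}) T x + T_{\tau_i}(1 - T) y + T_{\tau_i} T u$; this expansion is inferred from the edge probabilities described above rather than quoted from the text, and the names `g` and `max_identity_error` are illustrative.

```python
import random


def g(x, y, u, T_tau, T):
    """Assumed single-edge pgf given tau_i (our reading of equation (3))."""
    return ((1 - T_tau) * (1 - T)      # no transmission in either direction
            + (1 - T_tau) * T * x      # incoming only (j -> i)
            + T_tau * (1 - T) * y      # outgoing only (i -> j)
            + T_tau * T * u)           # undirected (both directions)


def max_identity_error(trials=1000, seed=0):
    """Largest deviation from g(1, m, m | tau) = g(1, y, u | tau), m = (1-T)y + Tu."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        y, u, T_tau, T = (rng.random() for _ in range(4))
        m = (1 - T) * y + T * u
        lhs = g(1, m, m, T_tau, T)
        rhs = g(1, y, u, T_tau, T)
        closed_form = 1 - T_tau + T_tau * m
        worst = max(worst, abs(lhs - rhs), abs(lhs - closed_form))
    return worst


print(max_identity_error())  # expected to be at the level of rounding error
```

Because the identity holds for every $\tau_i$, the pair $(H_f^{\mathrm{out}}, H_u^{\mathrm{out}})$ can be replaced by the single mixture $H_o^{\mathrm{out}}$ inside each integral.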
As a corollary, we find that the analysis in Ref. [1] can be corrected if we let
$G_0(x) = G(1, x, x)$ and $G_1(x) = G_o(1, x, x)$ (see equations (13) and (14) in [1]).



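Lemma 5 $H_i^{\mathrm{in}}(z) \le H_o^{\mathrm{out}}(z)$ for all $z \in [0, 1]$.

Proof. Since $\mathcal{G}'$ is convex,

$$\begin{aligned}
H_o^{\mathrm{out}}(z) &= z G_o(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)) \\
&= \frac{z}{\mathcal{G}'(1)} \int_0^\infty \mathcal{G}'(1 - T_{\tau_i} + T_{\tau_i} H_o^{\mathrm{out}}(z)) \, dF(\tau_i) \\
&\ge \frac{z}{\mathcal{G}'(1)} \mathcal{G}'(1 - T + T H_o^{\mathrm{out}}(z))
\end{aligned}$$

by Jensen's inequality. Equality holds only if $z = 0$, $H_o^{\mathrm{out}}(z) = 1$,
$\mathcal{G}'$ is constant, or $\tau_i$ is constant. Since $H_i^{\mathrm{in}}(z)$ is
the solution to

$$H_i^{\mathrm{in}}(z) = \frac{z}{\mathcal{G}'(1)} \mathcal{G}'(1 - T + T H_i^{\mathrm{in}}(z)),$$

we must have $H_o^{\mathrm{out}}(z) \ge H_i^{\mathrm{in}}(z)$. This can be seen by
fixing $z$ and considering the graphs of $y = z G_o(1, x, x)$ and
$y = \frac{z}{\mathcal{G}'(1)} \mathcal{G}'(1 - T + Tx)$. $H_o^{\mathrm{out}}(z)$ is
the value of $x$ at which $y = z G_o(1, x, x)$ intersects the line $y = x$.
$H_i^{\mathrm{in}}(z)$ is the value of $x$ at which
$y = \frac{z}{\mathcal{G}'(1)} \mathcal{G}'(1 - T + Tx)$ intersects the line $y = x$.
Since $z G_o(1, x, x) \ge \frac{z}{\mathcal{G}'(1)} \mathcal{G}'(1 - T + Tx)$, we must
have $H_o^{\mathrm{out}}(z) \ge H_i^{\mathrm{in}}(z)$. ■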

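Theorem 6 $H^{\mathrm{in}}(z) \le H^{\mathrm{out}}(z)$ for all $z \in [0, 1]$.
Equality holds only when $z = 0$, $z = 1$ and the percolation network is
subcritical, or the infectious period is constant.

Proof. From equation (14c),

$$\begin{aligned}
H^{\mathrm{in}}(z) &= z G(H_i^{\mathrm{in}}(z), 1, H_i^{\mathrm{in}}(z)) \\
&= z \mathcal{G}(1 - T + T H_i^{\mathrm{in}}(z)).
\end{aligned}$$

From equation (10),

$$\begin{aligned}
H^{\mathrm{out}}(z) &= z G(1, H_o^{\mathrm{out}}(z), H_o^{\mathrm{out}}(z)) \\
&= z \int_0^\infty \mathcal{G}(1 - T_{\tau_i} + T_{\tau_i} H_o^{\mathrm{out}}(z)) \, dF(\tau_i) \\
&\ge z \mathcal{G}(1 - T + T H_o^{\mathrm{out}}(z)) \\
&\ge z \mathcal{G}(1 - T + T H_i^{\mathrm{in}}(z)).
\end{aligned}$$

The first inequality follows from the convexity of $\mathcal{G}$ and Jensen's
inequality. The second follows from the fact that $\mathcal{G}$ is nondecreasing
and $H_o^{\mathrm{out}}(z) \ge H_i^{\mathrm{in}}(z)$. Equality holds in both
inequalities only if $z = 0$, $\mathcal{G}$ is constant, $H_i^{\mathrm{in}}(z) = 1$,
or $\tau_i$ is constant. ■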
Since the probability of an epidemic is $1 - H^{\mathrm{out}}(1)$ and the final
size of an epidemic is $1 - H^{\mathrm{in}}(1)$, it follows that the probability
of an epidemic is always less than or equal to its final size in the SIR model
from [1]. When the infectious period is constant,
$H^{\mathrm{out}}(z) = H^{\mathrm{in}}(z)$ for all $z \in [0, 1]$, so the in- and
out-component size distributions are identical and the probability and final
size of an epidemic are equal. When the infectious period has a nondegenerate
distribution and the percolation network is subcritical,
$H^{\mathrm{out}}(z) > H^{\mathrm{in}}(z)$ for all $z \in (0, 1)$ (so the in- and
out-components have dissimilar size distributions) but
$H^{\mathrm{out}}(1) = H^{\mathrm{in}}(1) = 1$ (so the probability and final size
of an epidemic are both zero). If the network is supercritical and the
infectious period is nonconstant, $H^{\mathrm{out}}(z) > H^{\mathrm{in}}(z)$ for
all $z \in (0, 1]$, so in- and out-components have dissimilar size distributions
and the probability of an epidemic is strictly less than its final size.
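The gap between the probability and the final size of an epidemic can be illustrated with a short fixed-point calculation. This is a minimal sketch under two assumptions that are not taken from the text: the contact network has a Poisson degree distribution (chosen only because $\mathcal{G}'(x)/\mathcal{G}'(1) = \mathcal{G}(x)$ then holds, which shortens the code), and the conditional transmission probability is $T_{\tau} = 1 - e^{-\beta \tau}$. The function names are illustrative.

```python
import math


def poisson_pgf(x, mean_degree):
    """pgf of a Poisson degree distribution (an assumption of this sketch)."""
    return math.exp(mean_degree * (x - 1.0))


def epidemic_predictions(beta, taus, probs, mean_degree, iters=5000):
    """Return (probability of an epidemic, final size) in the large-population limit."""
    # Assumed form of the conditional transmission probability given tau.
    T_tau = [1.0 - math.exp(-beta * tau) for tau in taus]
    T = sum(p * t for p, t in zip(probs, T_tau))  # marginal transmission probability

    def mixed_pgf(h):
        # integral of G(1 - T_tau + T_tau * h) dF(tau) for a discrete F
        return sum(p * poisson_pgf(1 - t + t * h, mean_degree)
                   for p, t in zip(probs, T_tau))

    # H_o^out(1): smallest fixed point of h = G_o(1, h, h); for a Poisson
    # network G_o(1, h, h) equals mixed_pgf(h). Iterating from 0 converges to it.
    h_out = 0.0
    for _ in range(iters):
        h_out = mixed_pgf(h_out)

    # H_i^in(1): smallest fixed point of h = G'(1 - T + T h) / G'(1).
    h_in = 0.0
    for _ in range(iters):
        h_in = poisson_pgf(1 - T + T * h_in, mean_degree)

    prob_epidemic = 1.0 - mixed_pgf(h_out)                           # 1 - H^out(1)
    final_size = 1.0 - poisson_pgf(1 - T + T * h_in, mean_degree)    # 1 - H^in(1)
    return prob_epidemic, final_size


# Two-point infectious period (tau = 1 or 20 with equal probability).
print(epidemic_predictions(0.1, [1.0, 20.0], [0.5, 0.5], 5.0))
# Constant infectious period: the two quantities should agree.
print(epidemic_predictions(0.1, [5.0, 5.0], [0.5, 0.5], 5.0))
```

With a nonconstant infectious period in the supercritical regime, the first value returned should be strictly smaller than the second, in line with Theorem 6.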

Since the bond percolation model predicts the distribution of in-component 
sizes, it cannot predict the distribution of out-component sizes or the proba- 
bility of an epidemic for any SIR model with a nonconstant infectious period. 
However, it does establish an upper limit for the probability of an epidemic 
in an SIR model. We have recently become aware of independent work [19] 
that shows similar results for more general sources of variation in infectiousness 
and susceptibility in a model where these are independent and uses Jensen's 
inequality to establish a lower bound for the probability and final size of an 
epidemic. The lower bound corresponds to a site percolation model with site 
occupation probability T, which is the model that minimized the probability of 
no transmission in the Introduction. 



6 Simulations 

In a series of simulations, the bond percolation model correctly predicted the 
mean outbreak size (below the epidemic threshold), the epidemic threshold, and 
the final size of an epidemic [1]. In Section 4, we showed that the epidemic 
percolation network generates the same predictions for these quantities. 

In Newman's simulations, the contact network had a power-law degree
distribution with an exponential cutoff around degree $\kappa$, so the
probability that a node has degree $k$ is proportional to
$k^{-\alpha} e^{-k/\kappa}$ for all $k \ge 1$. This distribution was chosen to
reflect degree distributions observed in real-world networks [1,13-15]. The
probability generating function for this degree distribution is

$$\mathcal{G}(z) = \frac{\mathrm{Li}_\alpha(z e^{-1/\kappa})}{\mathrm{Li}_\alpha(e^{-1/\kappa})},$$

where $\mathrm{Li}_\alpha(z)$ is the $\alpha$-polylogarithm of $z$. In [1], Newman
used $\alpha = 2$.
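This pgf is straightforward to evaluate numerically. The sketch below uses a truncated power series for the polylogarithm, $\mathrm{Li}_s(x) = \sum_{k \ge 1} x^k / k^s$, which is adequate for $|x| < 1$; the truncation length and the function names are choices made here for illustration. It also uses the standard identity $\mathcal{G}'(1) = \mathrm{Li}_{\alpha - 1}(e^{-1/\kappa}) / \mathrm{Li}_\alpha(e^{-1/\kappa})$ for the mean degree.

```python
import math


def polylog(s, x, terms=5000):
    """Truncated series Li_s(x) = sum_{k>=1} x**k / k**s, for 0 <= x < 1."""
    return sum(x ** k / k ** s for k in range(1, terms + 1))


def degree_pgf(z, alpha=2.0, kappa=10.0):
    """pgf of the degree distribution p_k proportional to k**-alpha * exp(-k/kappa)."""
    c = math.exp(-1.0 / kappa)
    return polylog(alpha, z * c) / polylog(alpha, c)


def mean_degree(alpha=2.0, kappa=10.0):
    """Mean degree G'(1) = Li_{alpha-1}(c) / Li_alpha(c), with c = exp(-1/kappa)."""
    c = math.exp(-1.0 / kappa)
    return polylog(alpha - 1.0, c) / polylog(alpha, c)


print(degree_pgf(1.0), mean_degree())
```

A quick sanity check is that `degree_pgf(1.0)` equals 1 and that a finite-difference slope of `degree_pgf` at $z = 1$ agrees with `mean_degree()`.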

In our simulations, we retained the same contact network but used a contact
model adapted from the counterexample in the Introduction. We fixed
$\beta_{ij} = \beta_0 = 0.1$ for all $ij$ and let $\tau_i = 1$ with probability
0.5 and $\tau_i = \tau_{\max} > 1$ with probability 0.5 for all $i$. The
predicted probability of an outbreak of size one is $G(1, 0, 0)$ in the
epidemic percolation network and $G(0, 1, 0)$ in the bond percolation model.
The predicted probability of an epidemic is $1 - H^{\mathrm{out}}(1)$ in the
epidemic percolation network and $1 - H^{\mathrm{in}}(1)$ in the bond
percolation model. In all simulations, an epidemic was declared when at least
100 persons were infected (this low cutoff produces a slight overestimate of
the probability of an epidemic in the simulations, favoring the bond
percolation model). Figures 2 and 3 show that percolation networks accurately
predicted the probability of an outbreak of size one for all
$(n, \kappa, \tau_{\max})$ combinations, whereas the bond percolation model
consistently underestimated these probabilities. Figures 4 and 5 show that the
bond percolation model significantly overestimated the probability of an
epidemic for all $(n, \kappa, \tau_{\max})$ combinations. The percolation
network predictions were far closer to the observed values.
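The two predictions for an outbreak of size one can be compared directly. In this model $G(1, 0, 0) = \int \mathcal{G}(1 - T_{\tau_i}) \, dF(\tau_i)$ and $G(0, 1, 0) = \mathcal{G}(1 - T)$, so Jensen's inequality already implies that the bond percolation value cannot exceed the epidemic percolation network value. The sketch below assumes $T_{\tau} = 1 - e^{-\beta_0 \tau}$ (not stated in this section) and evaluates the polylogarithm by a truncated series; all function names are illustrative.

```python
import math


def polylog(s, x, terms=5000):
    """Truncated series Li_s(x) = sum_{k>=1} x**k / k**s, for 0 <= x < 1."""
    return sum(x ** k / k ** s for k in range(1, terms + 1))


def degree_pgf(z, alpha, kappa):
    """pgf of the power-law degree distribution with exponential cutoff kappa."""
    c = math.exp(-1.0 / kappa)
    return polylog(alpha, z * c) / polylog(alpha, c)


def size_one_predictions(tau_max, kappa, alpha=2.0, beta=0.1):
    """Return (G(1,0,0), G(0,1,0)) for tau = 1 or tau_max with probability 0.5 each."""
    # Assumed conditional transmission probabilities for the two infectious periods.
    T_tau = [1.0 - math.exp(-beta * 1.0), 1.0 - math.exp(-beta * tau_max)]
    T = 0.5 * sum(T_tau)  # marginal transmission probability
    epn = 0.5 * sum(degree_pgf(1.0 - t, alpha, kappa) for t in T_tau)  # G(1, 0, 0)
    bond = degree_pgf(1.0 - T, alpha, kappa)                            # G(0, 1, 0)
    return epn, bond


for kappa in (10.0, 20.0):
    for tau_max in (10.0, 50.0, 100.0):
        print(kappa, tau_max, size_one_predictions(tau_max, kappa))
```

The first value in each pair should be at least as large as the second, matching the direction of the discrepancy reported for the bond percolation model.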

7 Discussion 

For any time-homogeneous SIR epidemic model, the problem of analyzing its 
final outcomes can be reduced to the problem of analyzing the components of 
an epidemic percolation network. The distribution of outbreak sizes starting 
from a node i is identical to the distribution of its out-component sizes in the 
probability space of percolation networks. Calculating this distribution may 
be extremely difficult for a finite population, but it simplifies enormously in 
the limit of a large population for many SIR models. For a single randomly 
chosen imported infection in the limit of a large population, the distribution of 
self-limited outbreak sizes is equal to the distribution of small out-component 
sizes and the probability of an epidemic is equal to the relative size of the GIN. 
For any finite set of imported infections, the relative final size of an epidemic is 
equal to the relative size of the GOUT. 

In this paper, we used epidemic percolation networks to reanalyze the SIR 
epidemic model studied in [1]. The mapping to a bond percolation model 
correctly predicts the distribution of in-component sizes, the critical transmissi- 
bility, and the final size of an epidemic. However, it fails to predict the correct 
distribution of outbreak sizes and overestimates the probability of an epidemic 
when the infectious period is nonconstant. Since all known infectious diseases 
have nonconstant infectious periods and heterogeneity in infectiousness has im- 
portant consequences in real epidemics [20-22], it is important to be able to 
analyze such models correctly. 

The exact finite-population isomorphism between a time-homogeneous SIR 
model and our semi-directed epidemic percolation network is not only useful be- 
cause it provides a rigorous foundation for the application of percolation meth- 
ods to a large class of SIR epidemic models (including fully-mixed models as well 
as network-based models), but also because it provides further insight into the 
epidemic model. For example, we used the mapping to an epidemic percolation 
network to show that the distribution of in- and out-component sizes in the SIR 
model from [1] could be calculated by treating the incoming and outgoing in- 
fectious contact processes as separate directed percolation processes, as in [19]. 
However, in contrast with [19], the semi-directed epidemic percolation network 
isolates the fundamental role of the GSCC in the emergence of epidemics. The
design of interventions to reduce the probability and final size of an epidemic is 
a central concern of infectious disease epidemiology. In a forthcoming paper, we 
analyze both fully-mixed and network-based SIR models in which vaccinating 
those nodes most likely to be in the GSCC is shown to be the most effective 
strategy for reducing both the probability and final size of an epidemic. If the 
incoming and outgoing contact processes are treated separately, the notion of 
the GSCC is lost. 
A Epidemic percolation networks

It is possible to define epidemic percolation networks for a much larger class 
of stochastic SIR epidemic models than the one from [1]. First, we specify 
an SIR model using probability distributions for recovery periods in individuals 
and times from infection to infectious contact in ordered pairs of individuals. 
Second, we outline time-homogeneity assumptions under which the epidemic 
percolation network is well-defined. Finally, we define infection networks and 
use them to show that the final outcome of the SIR model depends only on the 
set of imported infections and the epidemic percolation network. 

A.1 Model specification

Suppose there is a closed population in which every susceptible person is
assigned an index $i \in \{1, \ldots, n\}$. A susceptible person is infected
upon infectious contact, and infection leads to recovery with immunity or
death. Each person $i$ is infected at his or her infection time $t_i$, with
$t_i = \infty$ if $i$ is never infected. Person $i$ is removed (i.e., recovers
from infectiousness or dies) at time $t_i + r_i$, where the recovery period
$r_i$ is a random sample from a probability distribution $f_i(r)$. The recovery
period $r_i$ may be the sum of a latent period, when $i$ is infected but not yet
infectious, and an infectious period, when $i$ can transmit infection. We
assume that all infected persons have a finite recovery period. Let
$S(t) = \{i : t_i > t\}$ be the set of susceptible individuals at time $t$. Let
$t_{(1)} \le t_{(2)} \le \cdots \le t_{(n)}$ be the order statistics of
$t_1, \ldots, t_n$, and let $i_{(k)}$ be the index of the $k$th person infected.

When person $i$ is infected, he or she makes infectious contact with person
$j \ne i$ after an infectious contact interval $\tau_{ij}$. Each $\tau_{ij}$ is
a random sample from a conditional probability density
$f_{ij}(\tau \mid r_i)$. Let $\tau_{ij} = \infty$ if person $i$ never makes
infectious contact with person $j$, so $f_{ij}(\tau \mid r_i)$ has a
probability mass concentrated at infinity. Person $i$ cannot transmit disease
before being infected or after recovering, so $f_{ij}(\tau \mid r_i) = 0$ for
all $\tau < 0$ and all $\tau \in [r_i, \infty)$. The infectious contact time
$t_{ij} = t_i + \tau_{ij}$ is the time at which person $i$ makes infectious
contact with person $j$. If person $j$ is susceptible at time $t_{ij}$, then
$i$ infects $j$ and $t_j = t_{ij}$. If $t_{ij} < \infty$, then
$t_j \le t_{ij}$ because person $j$ avoids infection at $t_{ij}$ only if he or
she has already been infected.

For each person $i$, let his or her importation time $t_{0i}$ be the first time
at which he or she experiences infectious contact from outside the population,
with $t_{0i} = \infty$ if this never occurs. Let $F_0(t_0)$ be the cumulative
distribution function of the importation time vector
$t_0 = (t_{01}, t_{02}, \ldots, t_{0n})$.

A.2 Epidemic algorithm

First, an importation time vector $t_0$ is chosen. The epidemic begins with the
introduction of infection at time $t_{(1)} = \min_j (t_{0j})$. Person $i_{(1)}$
is assigned a recovery period $r_{i_{(1)}}$. Every person $j \in S(t_{(1)})$ is
assigned an infectious contact time $t_{i_{(1)} j} = t_{(1)} + \tau_{i_{(1)} j}$.
We assume that there are no tied infectious contact times less than infinity.
The second infection occurs at
$t_{(2)} = \min_{j \in S(t_{(1)})} \min(t_{0j}, t_{i_{(1)} j})$, which is the
time of the first infectious contact after person $i_{(1)}$ is infected. Person
$i_{(2)}$ is assigned a recovery period $r_{i_{(2)}}$. After the second
infection, each of the remaining susceptibles is assigned an infectious contact
time $t_{i_{(2)} j} = t_{(2)} + \tau_{i_{(2)} j}$. The third infection occurs at
$t_{(3)} = \min_{j \in S(t_{(2)})} \min(t_{0j}, t_{i_{(1)} j}, t_{i_{(2)} j})$,
and so on. After $k$ infections, the next infection occurs at
$t_{(k+1)} = \min_{j \in S(t_{(k)})} \min(t_{0j}, t_{i_{(1)} j}, \ldots, t_{i_{(k)} j})$.
The epidemic stops after $m$ infections if and only if $t_{(m+1)} = \infty$.
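The algorithm above can be sketched in a few lines once the importation times and the contact interval matrix have been drawn. The sketch below takes both as given inputs, so the recovery periods enter only implicitly (through which entries of the matrix are infinite); the function name `run_epidemic` and the zero-based indexing are choices made here for illustration.

```python
import math


def run_epidemic(t0, tau):
    """Infection times for importation times t0[j] and contact intervals tau[i][j].

    tau[i][j] = math.inf means i never makes infectious contact with j.
    Returns (t, order): t[j] is the infection time of j (math.inf if never
    infected) and order lists persons in the order they were infected.
    """
    n = len(t0)
    t = [math.inf] * n
    order = []
    while len(order) < n:
        next_time, next_person = math.inf, None
        for j in range(n):
            if t[j] < math.inf:
                continue  # j is no longer susceptible
            # Earliest infectious contact j receives from outside the
            # population or from anyone infected so far.
            contact = min([t0[j]] + [t[i] + tau[i][j] for i in order])
            if contact < next_time:
                next_time, next_person = contact, j
        if next_person is None:
            break  # t_(k+1) = infinity: the epidemic stops
        t[next_person] = next_time
        order.append(next_person)
    return t, order


inf = math.inf
t0 = [0.0, inf, inf]
tau = [[inf, 1.0, inf],
       [inf, inf, 2.0],
       [inf, inf, inf]]
print(run_epidemic(t0, tau))
```

In this small example person 0 is imported at time 0, infects person 1 after an interval of 1, and person 1 infects person 2 after a further interval of 2.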

A.3 Time homogeneity assumptions

In principle, the above epidemic algorithm could allow the infectious period and
outgoing infectious contact intervals for individual $i$ to depend on all
information about the epidemic available up to time $t_i$. In order to generate
an epidemic percolation network, we must ensure that the joint distributions of
recovery periods and infectious contact intervals are defined a priori. The
following restrictions are sufficient:

1. We assume that the distribution of the recovery period vector
$r = (r_1, r_2, \ldots, r_n)$ does not depend on the importation time vector
$t_0$, the contact interval matrix $\tau = [\tau_{ij}]$, or the history of the
epidemic.

2. We assume that the distribution of the infectious contact interval matrix
$\tau$ does not depend on $t_0$ or the history of the epidemic.

With these time-homogeneity assumptions, the cumulative distribution functions
$F(r)$ of recovery periods and $F(\tau \mid r)$ of infectious contact intervals
are completely specified a priori. Given $r$ and $\tau$, the epidemic
percolation network is a semi-directed network in which there is a directed
edge from $i$ to $j$ iff $\tau_{ij} < \infty$ and $\tau_{ji} = \infty$, a
directed edge from $j$ to $i$ iff $\tau_{ij} = \infty$ and
$\tau_{ji} < \infty$, and an undirected edge between $i$ and $j$ iff
$\tau_{ij} < \infty$ and $\tau_{ji} < \infty$. The entire time course of the
epidemic is determined by $r$, $\tau$, and $t_0$. However, its final size
depends only on the set $\{i : t_{0i} < \infty\}$ of possible imported
infections and the epidemic percolation network corresponding to $\tau$. In
order to prove this, we first define the infection network, which records the
chain of infection from a single realization of the epidemic model.
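The edge rules above translate directly into code. The following sketch builds the directed and undirected edge sets of an epidemic percolation network from a contact interval matrix; the function name, the zero-based indexing, and the representation of edges as tuples are illustrative choices.

```python
import math


def epidemic_percolation_network(tau):
    """Classify each pair {i, j} from the contact interval matrix tau.

    Returns (directed, undirected): directed contains ordered pairs (i, j)
    meaning an edge from i to j; undirected contains pairs (i, j) with i < j.
    """
    n = len(tau)
    directed, undirected = set(), set()
    for i in range(n):
        for j in range(i + 1, n):
            i_to_j = tau[i][j] < math.inf
            j_to_i = tau[j][i] < math.inf
            if i_to_j and j_to_i:
                undirected.add((i, j))   # contact possible in both directions
            elif i_to_j:
                directed.add((i, j))     # only i can infect j
            elif j_to_i:
                directed.add((j, i))     # only j can infect i
    return directed, undirected


inf = math.inf
tau = [[inf, 1.0, inf],
       [2.0, inf, inf],
       [inf, 0.5, inf]]
print(epidemic_percolation_network(tau))
```

Only the finiteness of each $\tau_{ij}$ matters here; the actual values affect the time course of the epidemic but not the network.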

A.4 Infection networks

Let $v_i$ be the index of the person who infected person $i$, with $v_i = 0$ for
imported infections and $v_i = \infty$ for uninfected nodes. If tied finite
infectious contact times are possible, then choose $v_i$ from all $j$ such that
$t_{ji} = t_i$. The infection network has the edge set
$\{v_i i : 0 < v_i < \infty\}$. It is a purely directed subgraph of the epidemic
percolation network corresponding to $\tau$ because $\tau_{v_i i} < \infty$ for
every edge $v_i i$. Since each node has at most one incoming edge, all
components of the infection network are trees or isolated nodes. Every imported
case is either the root node of a tree or an isolated node. Every person
infected through transmission within the population is a nonroot node in a
tree. Uninfected persons are isolated nodes.

The infection network can be represented by a vector $v = (v_1, \ldots, v_n)$,
as in Ref. [23]. If $v_j = 0$, then $t_j = t_{0j}$. If $0 < v_j < \infty$, then
$j$ is in a component of the infection network with a root node
$\mathrm{imp}_j$ and its infection time is

$$t_j = t_{\mathrm{imp}_j} + \sum_{k=1}^{m} \tau_{i_k j_k},$$

where the edges $i_1 j_1, \ldots, i_m j_m$ form a directed path from
$\mathrm{imp}_j$ to $j$. This path is unique because all nontrivial components
of the infection network are trees. If $v_j = \infty$, then $t_j = \infty$. The
removal time of each node $i$ is $t_i + r_i$. If there is more than one possible
infection network, they must all be consistent with $(t_1, \ldots, t_n)$ by
definition of $v_i$. Therefore, the entire time course of the epidemic is
determined by the importation time vector $t_0$, the recovery period vector
$r$, and the infectious contact interval matrix $\tau$.
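The path-sum formula for the infection time can be sketched by walking from a node up to the root of its tree. The sketch below labels persons $1, \ldots, n$ so that $v_j = 0$ can mark an imported case as in the text; storing `v`, `t0`, and `tau` in dictionaries and the function name `infection_time` are illustrative choices, and the input is assumed to be a valid infection network.

```python
import math


def infection_time(j, v, t0, tau):
    """Infection time of person j recovered from the infection network v.

    v[j] = 0 marks an imported case, v[j] = math.inf an uninfected person,
    and otherwise v[j] is the person who infected j. tau[(i, j)] is the
    infectious contact interval from i to j.
    """
    if v[j] == math.inf:
        return math.inf
    total = 0.0
    while v[j] != 0:              # walk up the tree toward the root imp_j
        total += tau[(v[j], j)]   # add the contact interval on edge v_j -> j
        j = v[j]
    return t0[j] + total          # the root is infected at its importation time


inf = math.inf
v = {1: 0, 2: 1, 3: 2, 4: inf}
t0 = {1: 0.5, 2: inf, 3: inf, 4: inf}
tau = {(1, 2): 1.0, (2, 3): 2.0}
print([infection_time(j, v, t0, tau) for j in (1, 2, 3, 4)])
```

Here person 1 is the root, persons 2 and 3 lie on a single chain below it, and person 4 is never infected.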

A.5 Final outcomes and epidemic percolation networks

Theorem 7 In an epidemic with infectious contact interval matrix $\tau$, a node
is infected if and only if it is in the out-component of a node $i$ with
$t_{0i} < \infty$ in the percolation network. (Equivalently, a node is infected
if and only if its in-component includes a node $i$ with $t_{0i} < \infty$.)
Therefore, the final outcome of the SIR model depends only on the set of
imported infections and the epidemic percolation network corresponding to
$\tau$.

Proof. Suppose that person $j$ is in the out-component of a node $i$ with
$t_{0i} < \infty$ in the epidemic percolation network corresponding to $\tau$.
Then there is a sequence $i_1 j_1, \ldots, i_m j_m$ such that $i_1 = i$,
$j_m = j$, and $\tau_{i_k j_k} < \infty$ for $1 \le k \le m$, so

$$t_j \le t_{0i} + \sum_{k=1}^{m} \tau_{i_k j_k} < \infty,$$

and $j$ must be infected during the epidemic. Now suppose that $t_j < \infty$.
Then there exists an imported case $i$ and a sequence
$i_1 j_1, \ldots, i_m j_m$ such that $i_1 = i$, $j_m = j$, and

$$t_j = t_{0i} + \sum_{k=1}^{m} \tau_{i_k j_k}.$$

Since $t_j < \infty$, it follows that $\tau_{i_k j_k} < \infty$ for all $k$. But
then the epidemic percolation network corresponding to $\tau$ has an edge with
the proper direction or an undirected edge between $i_k$ and $j_k$ for all $k$,
so $j$ is in the out-component of $i$. ■
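Theorem 7 can be illustrated by comparing two computations on randomly generated contact interval matrices: the set of persons with finite infection time, and the set reachable from imported cases along edges with finite contact intervals. This is a sketch under stated assumptions: infection times are computed as earliest arrival times with a Dijkstra-style priority queue (a shortcut used here, not the algorithm of Section A.2), finite intervals are drawn from an exponential distribution purely for illustration, and all function names are hypothetical.

```python
import heapq
import math
import random


def infected_by_dynamics(t0, tau):
    """Persons with finite infection time, via earliest arrival times."""
    n = len(t0)
    t = [math.inf] * n
    heap = [(t0[j], j) for j in range(n) if t0[j] < math.inf]
    heapq.heapify(heap)
    while heap:
        time, j = heapq.heappop(heap)
        if t[j] < math.inf:
            continue  # j was already infected at an earlier time
        t[j] = time
        for k in range(n):
            if k != j and t[k] == math.inf and tau[j][k] < math.inf:
                heapq.heappush(heap, (time + tau[j][k], k))
    return {j for j in range(n) if t[j] < math.inf}


def infected_by_percolation(t0, tau):
    """Union of out-components of imported cases in the percolation network."""
    n = len(t0)
    reached = {j for j in range(n) if t0[j] < math.inf}
    stack = list(reached)
    while stack:
        i = stack.pop()
        for j in range(n):
            # An outgoing or undirected edge i -> j exists iff tau[i][j] is finite.
            if j != i and j not in reached and tau[i][j] < math.inf:
                reached.add(j)
                stack.append(j)
    return reached


def random_tau(n, p, rng):
    """Contact interval matrix with finite entries present with probability p."""
    return [[rng.expovariate(1.0) if (i != j and rng.random() < p) else math.inf
             for j in range(n)] for i in range(n)]


def check(trials=200, n=30, p=0.05, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        tau = random_tau(n, p, rng)
        t0 = [0.0 if j == 0 else math.inf for j in range(n)]
        if infected_by_dynamics(t0, tau) != infected_by_percolation(t0, tau):
            return False
    return True


print(check())
```

The two sets agree in every trial because the final outcome depends only on which contact intervals are finite, not on their values.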

By the law of iterated expectation (conditioning on $\tau$), this result implies
that the distribution of outbreak sizes caused by the introduction of infection 
to node i is identical to the distribution of his or her out-component sizes in the 
probability space of epidemic percolation networks. Furthermore, the proba- 
bility that person i gets infected in an epidemic is equal to the probability that 
his or her in-component contains at least one imported infection. This isomor- 
phism holds in any finite population. In the limit of a large population, the 
probability that node i is infected in an epidemic is equal to the probability that 
he or she is in the GOUT and the probability that an epidemic results from the 
infection of node i is equal to the probability that he or she is in the GIN. This 
logic can be extended to predict the mean size of self-limited outbreaks and the 
probability and final size of an epidemic for outbreaks started by any given set 
of imported infections. 
Figure 1: Schematic diagram of the giant components, tendrils, and tubes of
a supercritical semi-directed network. Adapted from Broder et al. [7] and
Dorogovtsev et al. [8].






[Plot: Pr(final size = 1) for $\kappa = 10$]

Figure 2: The predicted and observed probabilities of an outbreak of size one
on a contact network with $\kappa = 10$ as a function of $\tau_{\max}$. Models
were run for $\tau_{\max}$ = 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70,
80, 90, and 100. Each observed value is based on 10,000 simulations in a
population of size $n$. For $n$ = 10,000, 1,000 simulations were conducted on
each of 10 contact networks. For $n$ = 1,000, 100 simulations were conducted on
each of 100 contact networks.
[Plot: Pr(final size = 1) for $\kappa = 20$]

Figure 3: The predicted and observed probabilities of an outbreak of size one
on a contact network with $\kappa = 20$ as a function of $\tau_{\max}$. Models
were run for $\tau_{\max}$ = 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30, 40,
50, 60, 70, 80, 90, and 100. Each observed value is based on 10,000 simulations
in a population of size $n$. For $n$ = 10,000, 1,000 simulations were conducted
on each of 10 contact networks. For $n$ = 1,000, 100 simulations were conducted
on each of 100 contact networks.
[Plot: Pr(epidemic) for $\kappa = 10$]

Figure 4: The predicted and observed probabilities of an epidemic on a contact
network with $\kappa = 10$ as a function of $\tau_{\max}$. Models were run for
$\tau_{\max}$ = 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, and
100. Each observed value is based on 10,000 simulations in a population of size
$n$. For $n$ = 10,000, 1,000 simulations were conducted on each of 10 contact
networks. For $n$ = 1,000, 100 simulations were conducted on each of 100
contact networks.
[Plot: Pr(epidemic) for $\kappa = 20$]

Figure 5: The predicted and observed probabilities of an epidemic on a contact
network with $\kappa = 20$ as a function of $\tau_{\max}$. Models were run for
$\tau_{\max}$ = 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30, 40, 50, 60, 70,
80, 90, and 100. Each observed value is based on 10,000 simulations in a
population of size $n$. For $n$ = 10,000, 1,000 simulations were conducted on
each of 10 contact networks. For $n$ = 1,000, 100 simulations were conducted on
each of 100 contact networks.